Audio-visual event detection based on mining of semantic audio-visual labels

Authors

  • King-Shy Goh
  • Koji Miyahara
  • Regunathan Radhakrishnan
  • Ziyou Xiong
  • Ajay Divakaran
Abstract

Removing commercials from television programs is a much sought-after feature for a personal video recorder. In this paper, we employ an unsupervised clustering scheme (CM Detect) to detect commercials in television programs. Each program is first divided into Ws-minute chunks, and we extract audio and visual features from each chunk. Next, we apply k-means clustering to assign each chunk a commercial or program label. In contrast to other methods, we make no assumptions about the program content. Thus, our method is highly content-adaptive and computationally inexpensive. Through empirical studies on various content, including American news, Japanese news, and sports programs, we demonstrate that our method filters out most of the commercials without falsely removing the regular program.

This work may not be copied or reproduced in whole or in part for any commercial purpose. Permission to copy in whole or in part without payment of fee is granted for nonprofit educational and research purposes provided that all such whole or partial copies include the following: a notice that such copying is by permission of Mitsubishi Electric Research Laboratories, Inc.; an acknowledgment of the authors and individual contributions to the work; and all applicable portions of the copyright notice. Copying, reproduction, or republishing for any other purpose shall require a license with payment of fee to Mitsubishi Electric Research Laboratories, Inc. All rights reserved.

Copyright © Mitsubishi Electric Research Laboratories, Inc., 2004. 201 Broadway, Cambridge, Massachusetts 02139.

Publication history: 1. First printing, TR-2004-008, March 2004.

Author affiliation: Mitsubishi Electric Research Labs, Cambridge, MA, USA. {goh, miyahara, regu, zxiong, ajayd}@merl.com
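The chunk-and-cluster pipeline described in the abstract can be illustrated with a minimal sketch. It is not the authors' CM Detect implementation. The two-dimensional feature vectors, the function names, the chunk length counted in frames (standing in for Ws minutes), and the rule that the smaller cluster is the commercial one are all assumptions made for illustration; the abstract does not say which features are used or how a cluster is mapped to a label.

```python
import random


def chunk_features(frames, chunk_len):
    """Average per-frame feature vectors over consecutive chunks.

    `frames` is a list of equal-length feature vectors (hypothetical
    audio/visual features); `chunk_len` is the chunk size in frames.
    """
    chunks = []
    for start in range(0, len(frames), chunk_len):
        block = frames[start:start + chunk_len]
        chunks.append([sum(col) / len(block) for col in zip(*block)])
    return chunks


def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k=2, iters=50, seed=0):
    """Plain Lloyd's k-means; returns one cluster index per point."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)
    assign = [0] * len(points)
    for i in range(iters):
        new_assign = [
            min(range(k), key=lambda c: sq_dist(p, centers[c]))
            for p in points
        ]
        if i > 0 and new_assign == assign:
            break  # assignments stopped changing
        assign = new_assign
        for c in range(k):
            members = [p for p, a in zip(points, assign) if a == c]
            if members:  # keep the old center if a cluster is empty
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return assign


def label_chunks(frames, chunk_len):
    """Label each chunk 'commercial' or 'program'.

    Assumption (not from the paper): commercials take up less airtime,
    so the smaller of the two clusters is treated as commercial.
    """
    assign = kmeans(chunk_features(frames, chunk_len), k=2)
    commercial = 0 if assign.count(0) < assign.count(1) else 1
    return ["commercial" if a == commercial else "program" for a in assign]
```

Calling label_chunks(frames, chunk_len) returns one label per chunk in broadcast order. If the two clusters happen to be the same size, the smaller-cluster rule above is arbitrary, so a real system would need a better-grounded way to decide which cluster holds the commercials.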


Similar resources

Comparing the Impact of Audio-Visual Input Enhancement on Collocation Learning in Traditional and Mobile Learning Contexts

This study investigated the impact of audio-visual input enhancement teaching techniques on improving English as a Foreign Language (EFL) learners' collocation learning as well as their accuracy concerning collocation use in narrative writing. In addition, it compared the impact and efficiency of audio-visual input enhancement in two learning contexts, namely traditional and mo...


A Comparison of Rule based and Distance Based Semantic Video Mining

In this paper, a subspace-based multimedia data mining framework is proposed for video semantic analysis, specifically video event/concept detection, by addressing two basic issues, i.e., semantic gap and rare event/concept detection. The proposed framework achieves full automation via multimodal content analysis and intelligent integration of distance-based and rule-based data mining technique...


Analysis of the No Return Point Hypothesis: The Effect of Audio and Visual Stimuli in the Fast Movements Inhibition

Background. The No Return Point hypothesis is a research area connected with the motor program. The hypothesis emphasizes the inability to inhibit a movement once the motor program has started it. Several factors affect the mechanism of this inhibition. Objectives. In this study, we investigate the effects of audio and visual stimuli on blocking quick moves to ...


The Effect of Audio-Visual Distraction on Catheterization Pain among School-Age Children

Background: Catheterization is the most common cause of pain and distress in children; it causes physical and psychological dysfunction and disrupts treatment. Therefore, the control of this type of pain should be considered a priority for nursing care. Audio-visual distraction can be used to reduce the intensity of pain. Aim: The purpose o...


Semantic Indexing and Multimedia Event Detection: ECNU at TRECVID 2012

This year we participated in two tasks: Semantic Indexing (SIN) and Multimedia Event Detection (MED). In this paper, we present our approaches and discuss the evaluation results. Semantic Indexing (SIN): For video semantic indexing, we focus on the performance improvement by using a Weighted Hamming Embedding kernel compared with traditional BoW approaches. Below are the brief descri...


Publication date: 2004